DAOS-18859: Test Fix#18321
Draft
knard38 wants to merge 2 commits into
After commit 8f3ac4a switched daos_eq_poll from DAOS_EQ_WAIT to DAOS_EQ_NOWAIT, two bugs were introduced in kv_put and kv_get:

1. Stale evp dereference on poll failure: when the inner spin loop exits with rc < 0 (poll error), evp is NOT updated by daos_eq_poll. The code then falls through to access evp->ev_error and call daos_kv_put/daos_kv_get with the stale pointer, which may point to an event still in-flight. This corrupts DAOS internal state and causes a SIGSEGV inside libdaos.so.
   Fix: add an explicit 'if (rc < 0) break;' guard after the inner spin loop, mirroring the original DAOS_EQ_WAIT code that had 'if (rc < 0) break;' as the first check after polling.

2. Missing ev_error check in kv_put drain loop: the new NOWAIT-based drain loop stopped checking evp->ev_error for each drained event, silently ignoring I/O errors that occurred on in-flight requests. The original DAOS_EQ_WAIT loop checked 'rc = evp->ev_error' on every completion.
   Fix: restore the ev_error check in the drain loop.

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
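For reviewers who want to see the two checks in isolation, here is a minimal sketch of the pattern described above. It does not use libdaos: mock_event, mock_eq, mock_eq_poll, reap_one and run_script are hypothetical stand-ins invented for this illustration, and the only behavior assumed from the commit message is that the poll leaves evp untouched unless an event actually completed.

```c
#include <stddef.h>

/* Hypothetical stand-in for daos_event_t: only the field the commit mentions. */
struct mock_event {
	int ev_error; /* completion status of the I/O, 0 on success */
};

/* Hypothetical stand-in for an event queue: a scripted list of poll results. */
struct mock_eq {
	const int         *results; /* <0 poll error, 0 nothing ready, 1 one event ready */
	size_t             count;
	size_t             next;
	struct mock_event *ready;   /* event handed out when a result is 1 */
};

/* Mimics a non-blocking poll: *evp is written only when an event completed. */
static int mock_eq_poll(struct mock_eq *eq, struct mock_event **evp)
{
	int rc;

	if (eq->next >= eq->count)
		return -1; /* script exhausted: report an error so the caller stops */
	rc = eq->results[eq->next++];
	if (rc > 0)
		*evp = eq->ready;
	return rc;
}

/* Reap one completion. Returns 0 on success, negative on poll or I/O error. */
static int reap_one(struct mock_eq *eq, struct mock_event **evp)
{
	int rc;

	/* Inner spin loop: keep polling while nothing is ready. */
	do {
		rc = mock_eq_poll(eq, evp);
	} while (rc == 0);

	/* Guard from fix 1: on poll failure *evp was not updated, so do not touch it. */
	if (rc < 0)
		return rc;

	/* Check from fix 2: surface the per-event I/O status. */
	return (*evp)->ev_error;
}

/* Small driver: runs reap_one against a scripted queue and returns its rc. */
static int run_script(const int *results, size_t count, int ev_error)
{
	struct mock_event  ev  = { .ev_error = ev_error };
	struct mock_event *evp = NULL; /* with the guard, NULL is never dereferenced */
	struct mock_eq     eq  = { results, count, 0, &ev };

	return reap_one(&eq, &evp);
}
```

In this mock, removing the 'if (rc < 0)' guard would make the poll-error path dereference a pointer that the poll never set, which is the same class of failure as the stale evp access described in item 1.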
Add D_ERROR + fprintf(stderr) diagnostic messages in kv_put() and kv_get() that fire when daos_eq_poll() returns rc < 0 (poll error) and the fix prevents the stale evp dereference. With the fix in place, the messages confirm the condition was caught and handled safely: the code breaks out without dereferencing the stale pointer, and the error propagates cleanly.

Quick-Functional: true
Test-repeat: 5
Test-tag: PoolAutotestTest,test_pool_autotest

Signed-off-by: Cedric Koch-Hofer <cedric.koch-hofer@hpe.com>
Errors are: component not formatted correctly; ticket number suffix is not a number (see https://daosio.atlassian.net/wiki/spaces/DC/pages/11133911069/Commit+Comments); unable to load ticket data.
Test stage Functional Hardware Medium MD on SSD completed with status FAILURE. https://jenkins-3.daos.hpc.amslabs.hpecorp.net//job/daos-stack/job/daos/view/change-requests/job/PR-18321/1/execution/node/740/log
Steps for the author:
After all prior steps are complete: